Parallelization of a finite element surface fitting algorithm for data mining
Authors
Abstract
A major task in data mining is to develop automatic techniques to process and to detect patterns in very large data sets. An important data mining technique is multivariate regression, and an essential subtask is the estimation of interaction surfaces, i.e. the estimation of functions of two variables. Thin plate splines provide a very good method to determine an approximating surface. Obtaining standard thin plate splines requires the solution of a dense linear system of equations of order n, where n is the number of observations. Standard thin plate splines may therefore be impractical, because the number of observations in data mining applications is often in the millions. We have developed a finite element approximation of a thin plate spline that can handle data sets with millions of records. The resolution of the finite element method can be chosen independently of the number of observations. The observation data is read from secondary storage once and does not need to be stored in memory. In this paper, we present a first parallel implementation of this method in an MPI environment.
Author affiliations:
Computer Science Laboratory, RSISE, Australian National University, Canberra, ACT 0200, Australia
School of Information Studies, Charles Sturt University, Wagga Wagga, NSW 2678, Australia
Department of Mathematics, University of Queensland, St. Lucia, QLD 4072, Australia
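The properties named in the abstract (a grid resolution chosen independently of n, a single pass over the data, and assembly that can be split across processes) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes piecewise-bilinear basis functions on a uniform grid over the unit square, replaces the thin plate smoothing term with a simple nearest-neighbour difference penalty, uses a dense solver, and imitates the MPI reduction by summing partial systems in one process. All function names (`records`, `accumulate`, `add_systems`, and so on) are hypothetical.

```python
import random

def records(seed, n):
    """Hypothetical data source: yields (x, y, z) one record at a time."""
    rng = random.Random(seed)
    for _ in range(n):
        x, y = rng.random(), rng.random()
        yield x, y, 1.0 + 2.0 * x + 3.0 * y

def weights(x, y, g):
    """Node indices and bilinear weights of the grid cell containing (x, y)."""
    i, j = min(int(x * g), g - 1), min(int(y * g), g - 1)
    s, t = x * g - i, y * g - j
    n0 = j * (g + 1) + i
    return [(n0, (1 - s) * (1 - t)), (n0 + 1, s * (1 - t)),
            (n0 + g + 1, (1 - s) * t), (n0 + g + 2, s * t)]

def new_system(g):
    m = (g + 1) ** 2  # system size depends on the grid only, not on n
    return [[0.0] * m for _ in range(m)], [0.0] * m

def accumulate(A, b, stream, g):
    """Add each record's least-squares contribution; no record is kept."""
    for x, y, z in stream:
        w = weights(x, y, g)
        for p, wp in w:
            b[p] += wp * z
            for q, wq in w:
                A[p][q] += wp * wq

def add_systems(parts):
    """Sum partial systems (the role a reduction would play under MPI)."""
    A, b = new_system(int(len(parts[0][1]) ** 0.5) - 1)
    for Ak, bk in parts:
        for p in range(len(b)):
            b[p] += bk[p]
            for q in range(len(b)):
                A[p][q] += Ak[p][q]
    return A, b

def add_penalty(A, g, weight):
    """Simplified smoothing term: penalise differences between grid neighbours."""
    for j in range(g + 1):
        for i in range(g + 1):
            p = j * (g + 1) + i
            for q in ([p + 1] if i < g else []) + ([p + g + 1] if j < g else []):
                A[p][p] += weight
                A[q][q] += weight
                A[p][q] -= weight
                A[q][p] -= weight

def solve(A, b):
    """Dense Gaussian elimination with partial pivoting (fine for small grids)."""
    n = len(b)
    M = [row[:] + [b[k]] for k, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    u = [0.0] * n
    for r in range(n - 1, -1, -1):
        u[r] = (M[r][n] - sum(M[r][k] * u[k] for k in range(r + 1, n))) / M[r][r]
    return u

def evaluate(u, g, x, y):
    return sum(wp * u[p] for p, wp in weights(x, y, g))

g = 4                       # 5 x 5 nodes, whatever the number of records
parts = []
for worker in range(4):     # each simulated worker streams its own records
    A_k, b_k = new_system(g)
    accumulate(A_k, b_k, records(worker, 500), g)
    parts.append((A_k, b_k))
A, b = add_systems(parts)
add_penalty(A, g, 1e-6 * 2000)   # smoothing weight scaled by the record count
u = solve(A, b)
value = evaluate(u, g, 0.5, 0.5)
```

Because each record only adds to a fixed-size matrix and right-hand side, the partial systems can be built independently and summed, which is why this kind of assembly step maps naturally onto a reduction in a message-passing setting.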
Similar resources
A parallel finite element surface fitting algorithm for data mining
A major task in data mining is to develop automatic techniques to process and to detect patterns in very large data sets. Multivariate regression techniques form the core of many data mining applications. A common assumption is that the multivariate data is well approximated by an additive model involving only first and second order interaction terms. In this case high-dimensional nonparametric...
Transient Fluid Flow Modeling in Fractured Aquifer of Sechahoon Iron Mine Using Finite Element Method
Because a large volume of the iron reserve in the Sechahoon Iron Mine in Yazd Province is located below the water table, a comprehensive study of water flow within the pit and its surroundings is necessary. The conceptual model of the aquifer was created using surface and underground geological information compared with water table data of the area of interest. In t...
Scalable parallel algorithms for surface fitting and data mining
This paper presents scalable parallel algorithms for high dimensional surface fitting and predictive modelling which are used in data mining applications. These algorithms are based on techniques like finite elements, thin plate splines, wavelets and additive models. They all consist of two steps: First, data is read from secondary storage and a linear system is assembled. Secondly, the linear ...
Design of Broaching Tool Using Finite Element Method for Achieving the Lowest Residual Tensile Stress in Machining of Ti6Al4V Alloy
The aim of this study is to use finite element simulation to find the optimal geometry of a broaching tool that creates the lowest tensile stress at the machined surface of the Ti6Al4V alloy. This plays a major role in reducing production costs and improving the surface integrity of the machined parts. The type and amount of residual stress determined by the thermal and mechanical loads transm...
Numerical Simulation of Impact of Low Velocity Projectiles With Water Surface
In this article, the Finite Element Method (FEM) and an Eulerian-Lagrangian Algorithm (ELA) formulation were used to numerically simulate the impact of several low-velocity projectiles with a water surface. The material models used to describe the behavior of air and water included the Null material model. For the projectiles, a rigid material model was applied. Results were validated by analyzing the impa...